Introduction

In 2011, Rolling Stone magazine published a ranked list of the 100 greatest artists across a broad span of musical genres. This report explores the enduring engagement with these artists and their work as of the end of 2023. To do so, I will retrieve the artists’ Spotify metrics and YouTube video statistics as my analytical dataset and interpret the findings with data visualisations.

Data

  1. Rolling Stone’s 100 Greatest Artists ranking data

This project aims to extract the rankings and names of the top 100 artists by scraping Rolling Stone’s webpage. Since the full list is split across two pages, the task involves retrieving data from both pages and merging them into a single data frame with 2 columns and 100 rows.

  2. Spotify API

In this report, I will first fetch each artist’s Spotify ID using API calls, and then retrieve their follower counts and popularity scores. To clarify, the popularity index on Spotify ranges from 0 to 100, with higher scores indicating greater current popularity. This metric can serve as a useful indicator of an artist’s relevance in the current media landscape.
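As a sketch of this retrieval step (the full code appears in the appendix), a single-artist lookup against the Spotify Web API’s `/v1/artists/{id}` endpoint might look as follows, assuming a valid bearer token is already stored in `access_token`; the example id is The Beatles’, taken from the data shown later.

```r
# A minimal sketch, assuming `access_token` holds a valid Spotify bearer token.
library(httr)
library(jsonlite)

resp <- GET("https://api.spotify.com/v1/artists/3WrFJ7ztbogyGnTHbHJFl2",
            add_headers(Authorization = paste("Bearer", access_token)))
artist <- fromJSON(content(resp, "text", encoding = "UTF-8"))
artist$followers$total  # follower count
artist$popularity       # popularity score, 0 to 100
```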

  3. YouTube channel data

Although the Spotify data captures the general level of the artists’ enduring engagement, more detailed quantitative evidence and audience feedback are needed. Therefore, I will locate their official YouTube channels and retrieve each channel’s latest video and its statistics, including views, comments, likes, favourites and the comment texts. Due to daily API query limits, I will narrow my focus to a sample of 20 representative artists for this analysis.
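The YouTube retrieval code itself is not reproduced in the appendix excerpt, so as a hedged sketch, fetching one video’s statistics through the YouTube Data API v3 `videos` endpoint could look like this; `yt_key` is a placeholder for an API key, and the video id is one from the sample shown later.

```r
# A minimal sketch, assuming a YouTube Data API v3 key in `yt_key`.
library(httr)
library(jsonlite)

video_id <- "i0azIt4O9aA"  # example id from the sample data
resp <- GET("https://www.googleapis.com/youtube/v3/videos",
            query = list(part = "statistics", id = video_id, key = yt_key))
stats <- fromJSON(content(resp, "text", encoding = "UTF-8"))$items$statistics
stats$viewCount  # likeCount, commentCount and favoriteCount come the same way
```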

The final analytical dataset for this project will comprise two tables: one with 5 columns and 100 rows (Spotify), and another with 8 columns and 20 rows (YouTube).

Analysis

The first few rows of the final structured Spotify data can be found below. In this section, I will visualise the data to interpret the artists’ current engagement.

# Review the first few rows of the updated dataframe.
data <- read.csv("artists_spotify_data")
head(data)
##                 Name Rank                     id followers popularity
## 1        The Beatles    1 3WrFJ7ztbogyGnTHbHJFl2  26716086         83
## 2          Bob Dylan    2 74ASZWbe4lXaubB36ztrGX   6321698         70
## 3      Elvis Presley    3 43ZHCT0cAZBISjO8DG9PnE   8986764         81
## 4 The Rolling Stones    4 22bE4uQ6baNwSHPVcDxLCe  13574412         78
## 5        Chuck Berry    5 293zczrfYafIItmnmM3coR   1912216         72
## 6       Jimi Hendrix    6 776Uo845nYHJpNaStv1Ds4   6482298         68

(Note: all charts here are interactive; hover over them to view the detailed data.)

The bar chart in graph P1 shows each artist’s number of Spotify followers. An artist with more followers can be assumed to have a larger fan base and stronger staying power. Notably, follower counts bear no linear relationship to the rankings and vary enormously across artists. Artists such as Eminem, Queen, and The Beatles have follower counts that far exceed the rest; Eminem in particular boasts nearly 80 million followers despite being ranked 83rd. Still, most artists’ fan bases sit under 10 million, with the lowest at just 10 thousand, hardly competitive on today’s music platforms.

The heat maps P2 and P2-2 show each artist’s Spotify popularity score, with warmer colours representing a higher popularity index. P2-2 rearranges P2 in descending order of popularity, giving a clearer view of how the score is distributed across the cohort. The majority of these artists are still celebrated today: 90% receive a popularity score over 50, and 26% score over 75. Combined with the follower distribution above, this shows that even without a massive fan base, the artists rated as the greatest continue to attract adoration. The data suggest their classics still enjoy steady engagement today, added to playlists and looped by countless listeners.
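The threshold shares quoted above can be recomputed directly from the saved Spotify table; a quick sketch (assuming the CSV written in the appendix is present):

```r
# Recompute the shares of artists above each popularity threshold.
data <- read.csv("artists_spotify_data")
round(100 * mean(data$popularity > 50))  # share scoring over 50
round(100 * mean(data$popularity > 75))  # share scoring over 75
```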

To gather further quantitative metrics, I use the YouTube API to retrieve the latest video posted on each sampled artist’s official channel. Here is an overview of the resulting data frame.

# View the final data.
load("youtube_sample_data.RData")
head(video_data2)
##           Name
## 1  The Beatles
## 2 Jimi Hendrix
## 3   Bob Marley
## 4    Sam Cooke
## 5 Otis Redding
## 6  The Ramones
##                                                                                             VideoTitle
## 1 “We were trying to think of how to make certain sounds that weren’t available…” - George #RedandBlue
## 2                                                                                                Angel
## 3      Half the story has never been told. New ‘Bob Marley: One Love’ trailer out TOMORROW. #bobmarley
## 4                                                                         Touch The Hem Of His Garment
## 5                              Hard to Handle (Karaoke Version) (Originally Performed By Otis Redding)
## 6                                                   Ramones - She&#39;s The One (Official Music Video)
##       VideoID ViewCount CommentCount LikeCount FavoriteCount
## 1 i0azIt4O9aA     37594           61      4351             0
## 2 b-8Azzi9vMA    109998            7      3199             0
## 3 WytTz95Z6kc     58104          140      4335             0
## 4 YUTzto-T7P0        81            0         2             0
## 5 S8oMh_OWj8w       217            0         4             0
## 6 g7KrBgnjaLY   1401497          647     26600             0
## Comments
## (The Comments column holds the raw text of each video's top-level
## comments; the lengthy free-text output is omitted here for readability.)

The bar charts below illustrate the statistics of the latest YouTube video of each sampled artist. Comparing the two graphs, we find that likes and views follow roughly the same distribution, with The Ramones and Madonna ranking highest. As with the Spotify follower statistics earlier, only a handful of artists currently enjoy a high streaming flow for their work. However, this does not mean they are being forgotten: as the final word cloud derived from the text of these video comments shows, they are still being “loved”.
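The word-cloud code is not included in the appendix excerpt; the following is a minimal sketch of how it could be built from the comment texts, assuming the `wordcloud` and `tm` packages (the actual package choice may differ):

```r
# A word-cloud sketch from the Comments column of the sample data.
library(wordcloud)
library(tm)

load("youtube_sample_data.RData")
text <- tolower(paste(video_data2$Comments, collapse = " "))
words <- unlist(strsplit(text, "[^a-z']+"))
words <- words[nchar(words) > 3 & !(words %in% stopwords("en"))]
freq <- sort(table(words), decreasing = TRUE)
wordcloud(names(freq), as.numeric(freq), max.words = 100)
```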

In conclusion, although the 100 Greatest Artists may no longer be as widely listened to as in the past, they continue to hold a special place in the musical landscape, their works transcending time and underscoring their enduring impact on the world of music.

Appendix: All code in this assignment

knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(echo = FALSE) 

# Load packages
library(rvest)
library(xml2)
library(dplyr)
library(spotifyr)
library(jsonlite)
library(httr)
library(tidyverse)

## Step 1: scrape the 100 greatest artists data from Rolling Stone webpage.======

# Scrape the first page's 50 artists.
# Navigate to the webpage.
website <- "https://www.rollingstone.com/music/music-lists/100-greatest-artists-147446/"

# Read the page's html.
res <- read_html(website)

# Get the 51-100 artists' data.
rank_51_100 <- res %>%
  html_nodes("h2") %>%
  html_text() %>%
  as.data.frame() %>%
  head(50)
colnames(rank_51_100) <- "Name" #Rename the column.
rank_51_100$Rank <- 100:51 #Add ranking numbers.

# Navigate to the next page which contains 1-50 artists' data.
website2 <- "https://www.rollingstone.com/music/music-lists/100-greatest-artists-147446/the-band-2-88489/"
res2 <- read_html(website2)
rank_1_50 <- res2 %>%
  html_nodes("h2") %>%
  html_text() %>%
  as.data.frame() %>%
  head(50)
colnames(rank_1_50) <- "Name" #Rename the column.
rank_1_50$Rank <- 50:1 #Add ranking numbers.

# Combine two data frames into one single table.
greatest_100_artists <- bind_rows(rank_51_100, rank_1_50) %>%
  arrange(Rank)

## Step 2: retrieve more details for analysis from Spotify API.================

# Set up my Spotify client id and secret by reading from a local ".env" file.
readRenviron("E:/文件/LSE/MY472/spotify id&secret.env")
id <- Sys.getenv("id")
secret <- Sys.getenv("secret")
Sys.setenv(SPOTIFY_CLIENT_ID = id)
Sys.setenv(SPOTIFY_CLIENT_SECRET = secret)

# Set the access token.
access_token <- get_spotify_access_token()

# Create a function to retrieve an artist's spotify id from api.
get_spotify_id <- function(artist_name) {
  # To avoid timeout issues, retry the API call: the loop below combines try()
  # and Sys.sleep to make repeated attempts before giving up.
  attempt <- 1 # Current attempt number.
  max_attempts <- 5 # Maximum number of attempts.
  # Write a loop to retrieve data while avoiding timeout errors.
  while (attempt <= max_attempts) {
    try({
      # Set up the api endpoint.
      search_url <- paste0('https://api.spotify.com/v1/search?q=', URLencode(artist_name), '&type=artist&limit=1')
      # Get response.
      response <- GET(search_url, timeout(30), add_headers(Authorization = paste('Bearer', access_token)))
      if (status_code(response) == 200) {
        search_results <- content(response, as = "parsed", type = "application/json")
        # An if statement handles searches that return no artist.
        if (length(search_results$artists$items) > 0) {
          return(search_results$artists$items[[1]]$id)
        } else {
          return(NA)
        }
      }
    }, silent = TRUE)
    attempt <- attempt + 1
    Sys.sleep(5)  # pause for 5 seconds before retrying
  }
  return(NA)
}

# Wrap the function above in another that retrieves Spotify ids for a whole data frame of artists.
get_ids <- function(rank) {
  for (i in 1:nrow(rank)) {
    name <- rank$Name[i]
    rank$id[i] <- get_spotify_id(name)
  }
  return(rank)
}

# Get all the artists' spotify ids and add them into my dataframe.
greatest_100_artists <- get_ids(greatest_100_artists)

# Retrieve followers and popularity index.====================================================

# Function to get an artist's followers and popularity index.
get_artist_details <- function(id, token) {
  # Set the URL endpoint (/v1/artists accepts a comma-separated list of ids).
  url <- paste0("https://api.spotify.com/v1/artists?ids=", id)
  
  # To avoid the timeout issues mentioned before, retry this request as well.
  attempt <- 1 # Current attempt number.
  max_attempts <- 5  # Maximum number of attempts.
  while (attempt <= max_attempts) {
    response <- tryCatch({
      # Get API response.
      GET(url, timeout(30), add_headers(`Authorization` = paste("Bearer", token)))
    }, error = function(e) { NULL })
    
    # Check if the request was successful
    if (!is.null(response) && status_code(response) == 200) {
      # Parse the JSON body of the response.
      details <- fromJSON(content(response, "text", encoding = "UTF-8"))
      # Extract the followers and popularity data.
      followers <- details$artists$followers$total
      popularity <- details$artists$popularity
      
      # Return the retrieved data as a one-row data frame.
      return(data.frame(followers = followers, popularity = popularity))
    } else {
      message(sprintf("Attempt %d failed. Retrying in %d seconds...", attempt, attempt))
      Sys.sleep(attempt)  # Linear back-off: wait one second longer after each failed attempt.
      attempt <- attempt + 1
    }
  }
  # If all attempts failed, output warning message and return NA values.
  warning("All attempts failed.")
  return(data.frame(followers = NA, popularity = NA))
}

# Loop over the artists data frame to add follower and popularity details.
for (i in 1:nrow(greatest_100_artists)) {
  id <- greatest_100_artists$id[i]
  df <- get_artist_details(id, access_token)
  greatest_100_artists$followers[i] <- df$followers[1]
  greatest_100_artists$popularity[i] <- df$popularity[1]
}

# Write this data into local storage for future use.
write.csv(greatest_100_artists, "artists_spotify_data", row.names = FALSE)
# Review the first few rows of the updated dataframe.
data <- read.csv("artists_spotify_data")
head(data)
head(data)
## Visualisation ==========================================================
# Load "plotly" package for generating interactive plots.
library(plotly)
library(ggplot2)
#install.packages("highcharter")
library(highcharter)

# Convert the Name column to a factor and keep the original order.
data$Name <- factor(data$Name, levels = unique(data$Name))

# Use highcharter to create an interactive bar chart.
hchart(data, "bar", hcaes(x = Name, y = followers)) %>%
  hc_title(text = "P1. The Spotify Followers of the 100 Greatest Artists") %>%
  hc_xAxis(labels = list(style = list(fontSize = '8px')))
# Create a heat map to show the popularity of all artists.
data$hover_info2 <- paste("Name:", data$Name, "<br>Popularity:", data$popularity, "<br>Rank:", data$Rank) # Set the hover information.
heatmap <- ggplot(data, aes(x = Name, y = 1, fill = popularity, text = hover_info2)) +
  geom_tile(color = "white") + # Create a tile plot.
  labs(title = "P2. The Spotify Popularity Score of the 100 Greatest Artists") +
  theme_minimal() + # Set up the theme.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1, size = 6), # Set the text of the x axis.
        axis.text.y = element_blank(), # Hide the text of the y axis.
        axis.ticks.y = element_blank(), # Hide the scale on the y axis of the chart.
        axis.title.y = element_blank(), # Hide the title of the y axis.
        axis.title.x = element_blank(), # Hide the title of the x axis.
        panel.grid = element_blank()) + # Hide the grid.
  scale_fill_gradientn(colors = c("white", "lightyellow", "red")) # Specify a gradient fill. 
ggplotly(heatmap, tooltip = "text") # Display the interactive plot.
# Create some popularity thresholds for analysis.
data_reverse <- data[order(-data$popularity),] # Reorder the artists data.
data_reverse$Name <- factor(data_reverse$Name, levels = data_reverse$Name) # Convert the Name column into a factor.
divider_position <- which(data_reverse$popularity < 50)[1] - 0.5 # Boundary between artists scoring 50 or more and those below.
divider_position2 <- which(data_reverse$popularity < 75)[1] - 0.5 # Boundary between artists scoring 75 or more and those below.

# Create another heat map to show the popularity of all artists after sorting in reverse order of the popularity index.
# This is done for comparing the differences between the rankings and popularity of all artists.
heatmap_compare <- ggplot(data, aes(x = reorder(Name, -popularity), y = 1, fill = popularity, text = hover_info2)) +
  geom_tile() + # Create a tile plot.
  geom_vline(xintercept = divider_position, color = "darkblue", linetype = "dashed", linewidth = 0.3) +
  geom_vline(xintercept = divider_position2, color = "darkblue", linetype = "dashed", linewidth = 0.3) +
  labs(title = "P2-2. The Spotify Popularity Score of the 100 Greatest Artists", x = "Name", y = "", fill = "Popularity") + # Set the labels.
  # Add annotations.
  annotate("text", label = "Popularity score: 50", x = divider_position, y = 1, angle = 90, vjust = -0.5, color = "darkblue", size = 3) + 
  annotate("text", label = "Popularity score: 75", x = divider_position2, y = 1, angle = 90, vjust = -0.5, color = "darkblue", size = 3) +
  annotate("text", label = "26%", x = divider_position2/2, y = 1.25, color = "darkgreen") +
  annotate("text", label = "64%", x = (divider_position+divider_position2)/2, y = 1.25, color = 'darkgreen') +
  annotate("text", label = "10%", x = (divider_position+100)/2, y = 1.25, color = 'darkgreen') +
  theme_minimal() + # Set the theme.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1, size = 6), # Set the text of the x axis.
        axis.text.y = element_blank(), # Hide the text of the y axis.
        axis.ticks.y = element_blank(), # Hide the scale on the y axis of the chart.
        axis.title.y = element_blank(), # Hide the title of the y axis.
        axis.title.x = element_blank(), # Hide the title of the x axis.
        panel.grid = element_blank()) + # Hide the grid.
  scale_fill_gradientn(colors = c("white", "lightyellow", "red")) # Specify a gradient fill.
ggplotly(heatmap_compare, tooltip = "text") # Display the interactive plot.

# Load the packages for the API requests and JSON parsing below.
library(httr)
library(jsonlite)

# Set up my YouTube data API key.
readRenviron("E:/文件/LSE/MY472/youtube api.env")
youtube_key <- Sys.getenv("KEY")
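# A small optional sanity check (my own addition, not required by the API):
# fail fast with a clear message if the key was not loaded from the .env file,
# rather than letting later requests fail with an opaque HTTP 400.
if (!nzchar(youtube_key)) stop("YouTube API key not found in the environment file.")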

# Create a function to fetch a YouTube channel ID by searching for an artist's name.
search_youtube_channel <- function(artist_name, api_key) {
  base_url <- "https://www.googleapis.com/youtube/v3/search"
  query <- list(part = "snippet", q = artist_name, type = "channel", key = api_key)
  
  response <- GET(url = base_url, query = query)
  data <- content(response, "parsed")
  # Return the first matching channel ID, or NA if the search found nothing.
  if (length(data$items) > 0) {
    return(data$items[[1]]$snippet$channelId)
  } else {
    return(NA)
  }
}
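# Usage sketch (not evaluated here, since every search call consumes quota):
# the search endpoint returns the best-matching channel, which can occasionally
# be a fan channel rather than the official one, so spot-checking a few of the
# returned IDs manually is advisable.
# search_youtube_channel("The Beatles", youtube_key)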

# Get the channel ids by searching artists' names.
channel_ids <- lapply(data$Name, search_youtube_channel, youtube_key)
ids <- unlist(channel_ids)
yt_channel <- data.frame(names = data$Name, ids = ids)

# Systematic sampling without replacement.
# Generate a sampling index (start at 1 and draw every 5th artist, giving 20 samples).
sampling_indices <- seq(1, nrow(yt_channel), by=5)
# Use the index to select from the sample.
selected_samples <- yt_channel[sampling_indices, ]

# Store the sampled table as a local file, because the YouTube API has a daily query quota.
write.csv(selected_samples, "selected_samples.csv", row.names = FALSE)
sampled_channel <- read.csv("selected_samples.csv")

# Write a function to get video statistics.
get_video_statistics <- function(video_id, api_key) {
  stats_base_url <- "https://www.googleapis.com/youtube/v3/videos"
  stats_params <- list(
    part = "statistics",
    id = video_id,
    key = api_key
  )
  stats_response <- GET(url = stats_base_url, query = stats_params)
  stats_content <- fromJSON(rawToChar(stats_response$content), flatten = TRUE)
  # Return NAs if the API response contains no items.
  if (length(stats_content$items) == 0) {
    return(list(
      ViewCount = NA,
      CommentCount = NA,
      LikeCount = NA,
      FavoriteCount = NA
    ))
  }
  
  # Ensure that each statistic is available, else NA. And collect the view counts, comment counts, like counts and favourite counts.
  view_count <- ifelse(!is.null(stats_content$items$statistics.viewCount), stats_content$items$statistics.viewCount, NA)
  comment_count <- ifelse(!is.null(stats_content$items$statistics.commentCount), stats_content$items$statistics.commentCount, NA)
  like_count <- ifelse(!is.null(stats_content$items$statistics.likeCount), stats_content$items$statistics.likeCount, NA)
  fav_count <- ifelse(!is.null(stats_content$items$statistics.favoriteCount), stats_content$items$statistics.favoriteCount, NA)
  
  # Structure the retrieved data as a data frame.
  video_stats <- data.frame(Like_Count = like_count,
                            View_Count = view_count,
                            Comment_Count = comment_count,
                            Fav_Count = fav_count)
  
  return(video_stats)
}
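# Usage sketch (the video ID below is a placeholder, not one from the dataset):
# get_video_statistics("VIDEO_ID_HERE", youtube_key)
# This returns a one-row data frame with Like_Count, View_Count, Comment_Count
# and Fav_Count as character strings; they are converted to numeric later on.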

# Write a function to get the latest videos and their statistics.
get_latest_videos <- function(channel_id, api_key) {
  base_url <- "https://www.googleapis.com/youtube/v3/search"
  # Set the parameters.
  params <- list(
    part = "snippet",
    channelId = channel_id,
    maxResults = 1,
    order = "date",
    type = "video",
    key = api_key
  )
  response <- GET(url = base_url, query = params)
  content <- fromJSON(rawToChar(response$content), flatten = TRUE)
  
  # Handle potential errors.
  if (!"items" %in% names(content)) {
    stop("No items found in API response.")
  }
  
  # Get the video's id, title, and statistics.
  video_id <- content$items$id.videoId
  video_title <- content$items$snippet.title
  statistics <- get_video_statistics(video_id[1], api_key)
  
  # Create a new data frame as final output.
  video_details <- data.frame(
    Name = sampled_channel$names[which(sampled_channel$ids == channel_id)],
    VideoTitle = video_title,
    VideoID = video_id,
    ViewCount = statistics$View_Count,
    CommentCount = statistics$Comment_Count,
    LikeCount = statistics$Like_Count,
    FavoriteCount = statistics$Fav_Count,
    stringsAsFactors = FALSE
    )
  
  return(video_details)
}

# Create a blank data frame for storage.
video_data2 <- data.frame()
# Fetch the latest video for each artist
for (i in 1:nrow(sampled_channel)) {
  channel_id <- sampled_channel$ids[i]
  videos2 <- get_latest_videos(channel_id, youtube_key)
  # Process the videos data.
  video_data2 <- rbind(video_data2, videos2)
}
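# Quota note (my own addition): if the loop above ever hits the per-minute
# request limit, a short pause between iterations helps, e.g.:
# for (i in 1:nrow(sampled_channel)) { ...; Sys.sleep(0.5) }
# The 0.5-second delay is an arbitrary choice, not a documented YouTube requirement.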

# Create a function to get the first 20 comments of each video.
get_comments <- function(video_id, api_key) {
  url <- paste0("https://www.googleapis.com/youtube/v3/commentThreads?key=", api_key,
                "&textFormat=plainText&part=snippet&videoId=", video_id,
                "&maxResults=20")  # Request the top 20 comments.
  response <- GET(url)
  content <- fromJSON(rawToChar(response$content))
  
  # Extract the comments.
  comments <- content$items$snippet$topLevelComment$snippet$textDisplay
  combined_comments <- paste(comments, collapse = "\n")
  return(combined_comments)
}

# Create a list for storage.
Comments <- list()
# Get the top few comments for each video.
for (i in 1:nrow(video_data2)) {
  comments <- get_comments(video_data2$VideoID[i], youtube_key)
  Comments[[i]] <- comments # Use [[ ]] to store each combined string as a list element.
}
video_data2$Comments <- unlist(Comments) # Flatten into a character column.

# Save the data to a local file.
save(video_data2, file = "youtube_sample_data.RData")
# View the final data.
load("youtube_sample_data.RData")
head(video_data2)
library(tidyverse)
transformed_data <- video_data2 %>%
  pivot_longer(cols = c(LikeCount, CommentCount),
               names_to = "Statistics")
transformed_data$value <- as.numeric(transformed_data$value)
hchart(transformed_data, "bar", hcaes(x = Name, y = value, group = Statistics)) %>%
  hc_title(text = "P3. Statistics of the Latest Videos of Sample Artists")

video_data2$ViewCount <- as.numeric(video_data2$ViewCount)
hchart(video_data2, "bar", hcaes(x = Name, y = ViewCount)) %>%
  hc_title(text = "P3-2. Sample Artist's Latest Video Views")

#install.packages("tm")
#install.packages("wordcloud")
library(tm) # Use the "tm" (Text Mining) package to process the text.
library(wordcloud) # Use this package to generate a wordcloud.

# Prepare the Text Data
comments_text <- paste(video_data2$Comments, collapse=" ")

# Create a Text Corpus and Clean the Data
corpus <- Corpus(VectorSource(comments_text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
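# Two further, optional cleaning steps (my own additions) that often sharpen a wordcloud:
corpus <- tm_map(corpus, removeNumbers) # Drop digits.
corpus <- tm_map(corpus, stripWhitespace) # Collapse repeated whitespace.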

wordcloud(corpus, scale=c(5,0.6), max.words=80, random.order=FALSE, rot.per=0, colors=brewer.pal(8, "Set1"))
# This chunk generates the complete code appendix.
# eval=FALSE tells R not to run ("evaluate") the code here, as it was already run above.